Voice Biometrics under Mismatched Noise Conditions
نویسنده
چکیده
Voice biometrics under mismatched noise conditions i ABSTRACT This thesis describes research into effective voice biometrics (speaker recognition) under mismatched noise conditions. Over the last two decades, this class of biometrics has been the subject of considerable research due to its various applications in such areas as telephone banking, remote access control and surveillance. One of the main challenges associated with the deployment of voice biometrics in practice is that of undesired variations in speech characteristics caused by environmental noise. Such variations can in turn lead to a mismatch between the corresponding test and reference material from the same speaker. This is found to adversely affect the performance of speaker recognition in terms of accuracy. To address the above problem, a novel approach is introduced and investigated. The proposed method is based on minimising the noise mismatch between reference speaker models and the given test utterance, and involves a new form of TestNormalisation (T-Norm) for further enhancing matching scores under the aforementioned adverse operating conditions. Through experimental investigations, based on the two main classes of speaker recognition (i.e. verification/ open-set identification), it is shown that the proposed approach can significantly improve the performance accuracy under mismatched noise conditions. In order to further improve the recognition accuracy in severe mismatch conditions, an approach to enhancing the above stated method is proposed. This, which involves providing a closer adjustment of the reference speaker models to the noise condition in the test utterance, is shown to considerably increase the accuracy in extreme cases of noisy test data. Moreover, to tackle the computational burden associated with the use of the enhanced approach with open-set identification, an efficient algorithm for its realisation in this context is introduced and evaluated. The thesis presents a detailed description of the research undertaken, describes the experimental investigations and provides a thorough analysis of the outcomes.
منابع مشابه
On combining multi-normalization and ancillary measures for the optimal score level fusion of fingerprint and voice biometrics
In this paper, we have considered the utility of multi-normalization and ancillary measures, for the optimal score level fusion of fingerprint and voice biometrics. An efficient matching score preprocessing technique based on multi-normalization is employed for improving the performance of the multimodal system, under various noise conditions. Ancillary measures derived from the feature space a...
متن کاملThe effect of mismatched recording conditions on human and automatic speaker recognition in forensic applications.
In this paper, we analyse mismatched technical conditions in training and testing phases of speaker recognition and their effect on forensic human and automatic speaker recognition. We use perceptual tests performed by non-experts and compare their performance with that of a baseline automatic speaker recognition system. The degradation of the accuracy of human recognition in mismatched recordi...
متن کاملAn Efficient Pso Optimized Integration Weight Estimation Using D-prime Statistics for a Multibiometric System
The efficiency of a multibiometric system can be improved by weighting the scores obtained from the degraded modalities in an appropriate manner. In this paper, we propose an efficient PSO (Particle Swarm Optimization) based integration weight optimization scheme using d-prime statistics to determine the optimal weight factors for the complementary modalities, under different noise conditions. ...
متن کاملVoice Biometrics for User Authentication
Voice biometrics for user authentication is a task in which the goal is to perform convenient, robust and secure authentication of speakers. In this work we investigate the use of state-of-theart text-independent and text-dependent speaker verification technology for user authentication. We evaluate three different authentication conditions: global digit strings, speaker specific digit stings a...
متن کاملAssessment of single-channel speech enhancement techniques for speaker identification under mismatched conditions
It is well known that MFCC based speaker identification (SID) systems easily break down under mismatched train and test conditions. In this study, we report on evaluation of four different single-channel speech enhancement front-ends for robust SID under such conditions. Speech files from the YOHO database are corrupted with four types of noise including babble, car, factory, and white at five ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2011